Fault Tolerance in Tandem Computer Systems
Authors
Abstract
Tandem builds single-fault-tolerant computer systems. At the hardware level, the system is designed as a loosely coupled multi-processor with fail-fast modules connected via dual paths. It is designed for online diagnosis and maintenance. A range of CPUs may be inter-connected via a hierarchical fault-tolerant local network. A variety of peripherals needed for online transaction processing are attached via dual ported controllers. A novel disc subsystem allows a choice between low cost-per-megabyte and low cost-per-access. System software provides processes and messages as the basic structuring mechanism. Processes provide software modularity and fault isolation. Process pairs tolerate hardware and transient software failures. Applications are structured as requesting processes making remote procedure calls to server processes. Process server classes utilize multi-processors. The resulting process abstractions provide a distributed system which can utilize thousands of processors. High-level networking protocols such as SNA, OSI, and a proprietary network are built atop this base. A relational database provides distributed data and distributed transactions. An application generator allows users to develop fault-tolerant applications as though the system were a conventional computer. The resulting system has price/performance competitive with conventional systems.
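The process-pair idea mentioned in the abstract can be illustrated with a minimal sketch. This is not Tandem's implementation: the class names (`CounterProcess`, `ProcessPair`), the running-total state, and the in-memory checkpoint are all hypothetical, chosen only to show a primary that checkpoints its state to a backup and a backup that takes over after a single fail-fast failure.

```python
from dataclasses import dataclass


class ProcessorFailure(Exception):
    """Raised when a simulated process is asked to work after it has failed."""


@dataclass
class CounterProcess:
    """Hypothetical server process whose whole state is a running total."""
    name: str
    total: int = 0
    alive: bool = True

    def apply(self, amount: int) -> int:
        # Fail-fast: a failed process refuses to act instead of corrupting state.
        if not self.alive:
            raise ProcessorFailure(self.name)
        self.total += amount
        return self.total

    def checkpoint_from(self, other: "CounterProcess") -> None:
        # Copy the primary's state so this process can resume from it later.
        self.total = other.total


@dataclass
class ProcessPair:
    """Hypothetical pair: requests go to the primary, the backup shadows it."""
    primary: CounterProcess
    backup: CounterProcess

    def request(self, amount: int) -> int:
        try:
            result = self.primary.apply(amount)
        except ProcessorFailure:
            # Takeover: the backup becomes the primary and retries the request
            # from the last checkpointed state.
            self.primary, self.backup = self.backup, self.primary
            result = self.primary.apply(amount)
        # Checkpoint before replying, if a live backup still exists.
        if self.backup.alive:
            self.backup.checkpoint_from(self.primary)
        return result


pair = ProcessPair(CounterProcess("primary"), CounterProcess("backup"))
pair.request(5)
pair.request(3)
pair.primary.alive = False      # simulate a single processor failure
after_failure = pair.request(2) # served by the former backup
```

The sketch tolerates exactly one failure, matching the single-fault-tolerant framing of the abstract; a second failure before a new backup is created would raise `ProcessorFailure` to the caller.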
Similar resources
Novel Defect Terminology Beside Evaluation And Design Fault Tolerant Logic Gates In Quantum-Dot Cellular Automata
Quantum-dot Cellular Automata (QCA) is one of the important nano-level technologies for implementing both combinational and sequential systems. QCA has the potential to achieve low power dissipation and to operate at high speed, at THz frequencies. However, the high probability of fabrication defects in QCA is a fundamental challenge to using this emerging technology. Because of these vari...
Proposing an Efficient Software-based Method to Enhance Reliability of Computer Systems against Soft Errors
In recent years, along with rapid developments in technology, computer systems have increasingly become more integrated and more modular. Indeed, the reliability and efficiency of computer systems are of high significance. Hence, the quantitative evaluation of the optimization of reliability indexes in computer systems is considered to be a crucial issue. Reliability enhancement of computer systems...
An Introduction to Fault-Tolerant Systems
This report is an introduction to fault-tolerance concepts and systems, mainly from the hardware point of view. An introduction to the terminology is given, and different ways of achieving fault-tolerance with redundancy are studied. Knowledge of software fault-tolerance is important, so an introduction to software fault-tolerance is also given. Finally, some systems are studied as case examples...
TANDEM COMPUTERS A Census of Tandem System Availability Between 1985 and 1990
Tandem computer systems are designed to be single-fault tolerant. This paper takes a census of customer system outages reported to Tandem. The census shows a clear improvement in the reliability of hardware and maintenance. It indicates that now (1989) software is the majority source of reported system outages (62%), followed by system operations (15%). This is a dramatic shift from the statist...
Why Do Computers Stop and What Can Be Done About It?
An analysis of the failure statistics of a commercially available fault-tolerant system shows that administration and software are the major contributors to failure. Various approaches to software fault-tolerance are then discussed, notably process-pairs, transactions and reliable storage. It is pointed out that faults in production software are often soft (transient) and that a transaction mec...
Publication date: 1986